problems with a sub-select (takes donkeys' years) - Mailing list pgsql-general
From | Stuart Rison |
---|---|
Subject | problems with a sub-select (takes donkeys' years) |
Date | |
Msg-id | v04003a05b325562e3c99@[128.40.242.190] Whole thread Raw |
Responses |
Re: [GENERAL] problems with a sub-select (takes donkeys' years)
|
List | pgsql-general |
Hello, I've got two SELECT statements which work fine (take about 5secs each) when I do them sequentially but take ages (around 5mins) if in a statement with a sub-SELECT. The individual SELECTs are: SELECT p1.brecard_id FROM malignant_pathologies p1, malignant_pathologies p2, malignant_pathologies p3 WHERE p1.code='MAPH' AND p1.brecard_id=p2.brecard_id AND p2.code='AMCA' AND p3.brecard_id=p1.brecard_id AND p3.code='LOCA'; brecard_id ---------------- DSTL12031999016 DPHA12031999017 DCCH12031999056 DUCH12031999059 DUCH12031999063 DCCH12031999077 DUCH12031999098 (7 rows) SELECT brecard_id,count(brecard_id) FROM malignant_pathologies WHERE brecard_id IN ('DSTL12031999016','DPHA12031999017','DCCH12031999056','DUCH12031999059','DUCH12 031999063','DCCH12031999077','DUCH12031999098') GROUP BY brecard_id; brecard_id |count ----------------+----- DCCH12031999056 | 10 DCCH12031999077 | 12 DPHA12031999017 | 11 DSTL12031999016 | 11 DUCH12031999059 | 13 DUCH12031999063 | 6 DUCH12031999098 | 14 (7 rows) (This is just an example query, what I'm trying to do here is use the list of brecard_id's generated by the previous query). Both of these execute in about 5 seconds (on 1000 rows) When I put them together as: SELECT brecard_id,count(brecard_id) FROM malignant_pathologies WHERE brecard_id IN ( SELECT p1.brecard_id FROM malignant_pathologies p1, malignant_pathologies p2, malignant_pathologies p3 WHERE p1.code='MAPH' AND p1.brecard_id=p2.brecard_id AND p2.code='AMCA' AND p3.brecard_id=p1.brecard_id AND p3.code='LOCA'); It takes around 5mins for the query to complete! The EXPLAIN for the quey is: NOTICE: QUERY PLAN: Aggregate (cost=4.27 size=0 width=0) -> Group (cost=4.27 size=0 width=0) -> Sort (cost=4.27 size=0 width=0) -> Seq Scan on malignant_pathologies (cost=4.27 size=99 width=12) SubPlan -> Nested Loop (cost=8.27 size=1 width=36) -> Nested Loop (cost=6.27 size=1 width=24) -> Seq Scan on malignant_pathologies p3 (cost=4.27 size=1 width=12) -> Index Scan using malignant_pathologies_pkey on malignant_pathologies p1 (cost=2.00 size=1 width=12) -> Index Scan using malignant_pathologies_pkey on malignant_pathologies p2 (cost=2.00 size=1 width=12) EXPLAIN [Which is not different from putting together the EXPLAINs from each of the individual queries] Can anyone explain why the sub-query form takes so long? Regards, Stuart. +-------------------------+--------------------------------------+ | Stuart Rison | Ludwig Institute for Cancer Research | +-------------------------+ 91 Riding House Street | | Tel. (0171) 878 4041 | London, W1P 8BT, UNITED KINGDOM. | | Fax. (0171) 878 4040 | stuart@ludwig.ucl.ac.uk | +-------------------------+--------------------------------------+
pgsql-general by date: